Lecture speech recognition by combining word graphs of various acoustic models

Authors

Tetsuo Kosaka

Keisuke Goto

Takashi Ito

Masaharu Katoh

Abstract

The aim of this work is to improve the performance of lecture speech recognition by using a system combination approach. We propose a new combination technique in which various types of acoustic models are combined. System combination depends on the component systems supplying complementary information. To prepare acoustic models that incorporate a variety of acoustic features, we employ both continuous-mixture hidden Markov models (CMHMMs) and discrete-mixture hidden Markov models (DMHMMs), which exhibit different patterns of recognition errors. In addition, we propose a new maximum mutual information (MMI) estimation of the DMHMM parameters. To evaluate the proposed method, we conduct recognition experiments on the "Corpus of Spontaneous Japanese". In these experiments, a combination of CMHMMs and DMHMMs whose parameters were estimated with the MMI criterion exhibited the best recognition performance.
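
As a rough illustration of why complementary error patterns help in system combination, the sketch below merges word hypotheses from two recognizers by summing per-word confidence scores in each slot. It is a simplified voting scheme under stated assumptions, not the word-graph combination proposed in the paper, whose details the abstract does not give. The function name `combine_slots`, the pre-aligned slot structure, and all scores are hypothetical.

```python
from collections import defaultdict


def combine_slots(systems):
    """Pick, for each aligned slot, the word with the highest summed score.

    `systems` is a list of hypotheses, one per recognizer. Each hypothesis is
    a list of slots, and each slot maps candidate words to a confidence score.
    All hypotheses are assumed to be pre-aligned to the same number of slots.
    """
    n_slots = len(systems[0])
    if any(len(hyp) != n_slots for hyp in systems):
        raise ValueError("hypotheses must be aligned to the same slots")

    result = []
    for i in range(n_slots):
        totals = defaultdict(float)
        for hyp in systems:
            for word, score in hyp[i].items():
                totals[word] += score  # accumulate evidence across systems
        result.append(max(totals, key=totals.get))
    return result


# Hypothetical scores from a CMHMM-based and a DMHMM-based recognizer.
cmhmm = [{"speech": 0.9}, {"wreck": 0.6, "recognition": 0.4}]
dmhmm = [{"speech": 0.8}, {"recognition": 0.7, "wreck": 0.3}]
combined = combine_slots([cmhmm, dmhmm])
```

In this toy input the first recognizer prefers "wreck" in the second slot on its own, but the second recognizer's evidence tips the summed score toward "recognition". A second system only adds value in this way if its errors differ from those of the first.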

Similar resources

Allophone-based acoustic modeling for Persian phoneme recognition

Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the main obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighboring phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...

Spontaneous speech recognition using a massively parallel decoder

Since spontaneous utterances include many variations, speaker- and task-independent general models do not work well. This paper proposes combining cluster-based language and acoustic models based on the framework of the Massively Parallel Decoder (MPD). The MPD is a parallel decoder with a large number of decoding units, each of which is assigned to one combination of element models. It run...

A back-off discriminative acoustic model for automatic speech recognition

In this paper we propose a back-off discriminative acoustic model for Automatic Speech Recognition (ASR). We use a set of broad phonetic classes to divide the classification problem originating from context-dependent modeling into a set of subproblems. By appropriately combining the scores from classifiers designed for the sub-problems, we can guarantee that the back-off acoustic score for diff...

A hybrid SVM/HMM acoustic modeling approach to automatic speech recognition

Acoustic models based on a NN/HMM framework have been used successfully on various recognition tasks for continuous speech recognition. Recently tied-posteriors have been introduced within this context. Here, we present an approach combining SVMs and HMMs using the tied-posteriors idea. One set of SVMs calculates class posterior probabilities and shares these probabilities among all HMMs. The n...

Interfacing acoustic models with natural language processing systems

The research presented here focuses on implementation and efficiency issues associated with the use of word graphs for interfacing acoustic speech recognition systems with natural language processing systems. The effectiveness of various pruning methods for graph construction is examined, as well as techniques for word graph compression. In addition, the word graph representation is compared to...

Publication date: 2010
